HSMVS: heuristic search for minimum vertex separator on massive graphs

In graph theory, the problem of finding a minimum vertex separator (MVS) is a classic NP-hard problem, and it plays a key role in a number of important practical applications. Real-world massive graphs are very large, which calls for effective approximate methods, especially heuristic search algorithms. In this article, we present a simple yet effective heuristic search algorithm dubbed HSMVS for solving MVS on real-world massive graphs. Our HSMVS algorithm is developed on the basis of an efficient construction procedure and a simple yet effective vertex-selection heuristic. Experimental results on a large number of real-world massive graphs show that HSMVS is able to find much smaller vertex separators than three effective heuristic search algorithms, indicating the effectiveness of HSMVS. Further empirical analyses confirm the effectiveness of the underlying components in our proposed algorithm.


INTRODUCTION
In graph theory, there exist a variety of well-known combinatorial optimization problems, which have many important real-world applications (Wang et al., 2017; Li et al., 2017a; Wang et al., 2018a; Li, Li & Yin, 2019; Sun et al., 2023). Many effective algorithms have been proposed for solving these combinatorial optimization problems, and they achieve good performance on academic benchmarks (mainly randomly generated graphs and crafted graphs). Along with the rapid evolution of the Internet, the rapid growth of real-world networks has resulted in more massive graphs. These real-world massive graphs bring new challenges for practical solving, as existing algorithms usually become ineffective when dealing with them (Cai, 2015). The appearance of massive graphs urgently calls for efficient algorithms, since efficient algorithms for solving combinatorial optimization problems would bring much benefit in practice.
Given an undirected graph G = (V, E), where each vertex v_i ∈ V is associated with a positive integer c_i as its cost, and a positive integer b (1 ≤ b ≤ 2/3|V|), which stands for the limitation size, a vertex separator C is a subset of V whose removal partitions the remaining collection of vertices (i.e., V \ C) into two components, such that the size of each component (i.e., the number of vertices in each component) is no greater than b.
The minimum vertex separator (MVS) problem is to find such a vertex separator with the smallest total cost in the given graph. In theory, the MVS problem has been proven to be NP-hard (Bui & Jones, 1992; Fukuyama, 2006). Besides its considerable importance in theory, the MVS problem is of great significance in practice: MVS has a broad range of real-world applications, e.g., VLSI design, computational biology, parallel processing, and hyper-graph partitioning (Balas & de Souza, 2005; Evrendilek, 2008; Biha & Meurs, 2011; Kayaaslan et al., 2012; Benlic & Hao, 2013; Gomes et al., 2023), and MVS techniques have been utilized to quantify robustness in complex networks and detect network bottlenecks in communication networks (Montes-Orozco et al., 2022; Montes-Orozco et al., 2021; Zhang & Shao, 2015).
In this article, we present an efficient MVS heuristic search algorithm named HSMVS, which concentrates on only one simple yet effective vertex-selection heuristic. HSMVS first utilizes an efficient construction procedure to initialize the solution, and then applies a vertex-selection heuristic to modify the solution. The vertex-selection heuristic combines random walk and the approximate best selection strategy in an effective way to strike a good balance between intensification and diversification. In order to evaluate the effectiveness of our HSMVS algorithm, we conduct extensive experiments to empirically compare HSMVS against BLS, BLS-RLE and New_K-OPT on a broad range of real-world massive graphs. The experimental results show that HSMVS is able to find better solutions than BLS, BLS-RLE, and New_K-OPT on a large number of graphs. Also, we conduct further empirical evaluations to confirm the effectiveness of the random walk component and the approximate best selection component underlying the HSMVS algorithm.
The remainder of this article is organized as follows. In 'Related Work', we give a brief review on MVS solving from the perspectives of both theory and practice. In 'Preliminaries', we provide necessary definitions, concepts and notations. In 'Heuristic Search Framework for Solving MVS', we present a simple heuristic search framework for solving MVS. In 'The HSMVS Algorithm', we propose a new heuristic search algorithm called HSMVS, and introduce the construction procedure and the modification heuristic of the algorithm in detail. In 'Experiments', we present extensive experiments comparing HSMVS against an effective breakout local search algorithm BLS, its optimized version BLS-RLE, and an improved K-OPT local search algorithm New_K-OPT on a wide range of real-world massive graphs. In 'Discussions', we conduct more empirical evaluations to study the effectiveness of the underlying components in the HSMVS algorithm. In 'Conclusions and Future Work', we give the conclusions of this article and list the future work.

RELATED WORK
Minimum vertex separator is an important NP-hard combinatorial optimization problem in graph theory, and it has attracted growing attention from academia. Furthermore, this problem is becoming increasingly important because it has been shown to have real-world applications. Thus, there exist a number of works devoted to solving MVS in either theory or practice. In this section, we give a brief review on MVS solving, and discuss MVS algorithms from the perspectives of both theory and practice.

Theoretical algorithms
Because MVS has been proven to be NP-hard (Bui & Jones, 1992; Fukuyama, 2006), it seems impossible to design exact algorithms that run in polynomial time. Thus, most theoretical works on MVS focus on designing approximation algorithms, which aim at improving the approximation ratio for this NP-hard combinatorial optimization problem. Leighton & Rao (1999) presented an approximation algorithm for MVS, which is based on linear programming and gives an approximation ratio of O(log n). Then, Feige, Hajiaghayi & Lee (2008) developed an approximation algorithm for MVS, which is based on novel linear and semidefinite program relaxations, and obtained an approximation ratio of O(√(log opt)), where opt is the size of an optimal vertex separator.

Practical algorithms
Even though a number of great contributions have been made on the theoretical analysis of MVS solving, the performance of theoretical algorithms for MVS is still unsatisfactory in practice. As MVS has important applications in real-world situations, such as VLSI design, computational biology, etc. (Balas & de Souza, 2005; Biha & Meurs, 2011; Benlic & Hao, 2013; Zhang & Shao, 2015; Dagdeviren, Akram & Farzan, 2019; Furini et al., 2022), a number of practical algorithms for MVS have been proposed. As mentioned in the 'Introduction', practical algorithms for MVS can be classified into two categories: exact algorithms and heuristic search algorithms. Exact algorithms guarantee the optimality of the solutions they return, but they may fail to return good-quality solutions within reasonable time on large-sized instances (Benlic & Hao, 2013). Heuristic search algorithms cannot prove optimality for the solutions they find, but they are able to find good-quality solutions for large-sized instances efficiently.
Most previous works on practical MVS solving focus on designing and improving exact algorithms. de Souza & Balas (2005) developed a branch-and-cut algorithm for MVS, which is based on the mixed integer programming formulation (Balas & de Souza, 2005).
After that, Biha & Meurs (2011) designed an exact algorithm for MVS, on the basis of new classes of valid inequalities for the associated polyhedron. Further, de Souza & Cavalcante (2011) proposed a hybrid algorithm, which is built on a Lagrangian relaxation framework.
In the context of MVS solving by heuristic search, Benlic & Hao (2013) developed the first local search algorithm called BLS for solving MVS. In order to improve the performance, BLS incorporates several sophisticated heuristics (including a greedy hill-climbing component, an adaptive perturbation mechanism, a hashing function and a jumping-magnitude determining component), which introduce six instance-dependent parameters. There exists an improved version of BLS, which is called BLS-RLE (Benlic, Epitropakis & Burke, 2017). BLS-RLE introduces an effective parameter control mechanism that draws upon ideas from reinforcement learning theory to reach an interdependent decision. According to the computational results reported in the literature (Benlic & Hao, 2013; Benlic, Epitropakis & Burke, 2017), BLS is able to handle graphs with up to 3,000 vertices and runs much faster than a number of high-performance exact algorithms, and BLS-RLE exhibits its effectiveness in solving the MVS problem. Besides, Zhang et al. (2015) proposed an improved K-OPT local search algorithm named New_K-OPT. The experimental results reported in the literature (Zhang et al., 2015) show that New_K-OPT exhibits relatively better performance compared to variable neighborhood search, simulated annealing and Relax-and-Cut (de Souza & Cavalcante, 2011) on a number of graphs.

PRELIMINARIES
In this section, we give the necessary background on the minimum vertex separator (MVS) problem. An undirected graph G = (V, E) consists of a set of vertices V and a set of edges E ⊆ V × V, where each edge e is a pair of two different vertices in V. For an edge e = (u, v), we say that vertices u and v are the endpoints of edge e. Two different vertices are neighbors if and only if they are the endpoints of the same edge. We use the notation N(v) = {u | (u, v) ∈ E and u ≠ v} to denote the set of all neighboring vertices of v. The degree of a vertex v is the number of its neighbors, i.e., |N(v)|.
Given an undirected graph, where each vertex is associated with a positive integer as its cost, and a limitation size, a vertex separator is a subset of vertices whose removal divides the remaining vertices into two disjoint components (i.e., there is no edge connecting those two components), subject to the size of each component (i.e., the number of vertices in each component) being no greater than the limitation size. In this article, we address the problem of finding such a vertex separator with as small a total cost as possible.
More formally, given an undirected graph G = (V, E) with a cost c_i corresponding to each vertex v_i ∈ V and a positive integer b (1 ≤ b ≤ |V|) denoting the limitation size, the minimum vertex separator (MVS) problem is to find a partition which divides V into three disjoint subsets A, B and C, such that (i) A and B are non-empty; (ii) there is no edge (u, v) ∈ E with u ∈ A and v ∈ B; (iii) max{|A|, |B|} ≤ b; and (iv) the total cost Σ_{v_j ∈ C} c_j is minimized. The vertex separator C is feasible when the first three constraints (i, ii and iii) are satisfied, and is optimal when all constraints are satisfied. In theory, the MVS problem with 0 ≤ b ≤ 2/3|V| has been proven to be NP-hard (Feige & Mahdian, 2006; Bui & Jones, 1992; Fukuyama, 2006). Since the empirical study on solving MVS (Benlic & Hao, 2013) demonstrates that it is computationally difficult to solve the MVS problem with b = 1.05|V|/2, in this work we follow this setting, and in our major experiments (as presented in 'Experiments') b is set to 1.05|V|/2 accordingly. Moreover, we would like to note that, in 'Discussions' we conduct empirical evaluations with b = 0.6, so as to study the performance of our proposed algorithm under different values of b.
The concept of solution is very important in heuristic search algorithms.
In the MVS problem (where b denotes the limitation size), a partition s = {A, B, C}, which divides the vertex set V into three disjoint subsets and guarantees that the sizes of A and B are not greater than b, is called a solution. The cost of a solution s, denoted as cost(s), is the sum of the cost c_j of each v_j ∈ C (i.e., cost(s) = Σ_{v_j ∈ C} c_j). Obviously, the smaller the value of cost(s) is, the better the quality of solution s is. Hence, the MVS problem aims to find a solution s of minimum cost.
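As an illustration of these definitions, the following minimal Python sketch computes cost(s) and checks the feasibility conditions of a partition. The adjacency-set representation `adj`, the dictionary `cost`, and the function names are assumptions made for this example and are not taken from the HSMVS implementation.

```python
from typing import Dict, Set

# Hypothetical representation: each vertex maps to the set of its neighbors N(v).
Graph = Dict[int, Set[int]]


def solution_cost(C: Set[int], cost: Dict[int, int]) -> int:
    """cost(s): the sum of the costs of the vertices in the separator C."""
    return sum(cost[v] for v in C)


def is_feasible(adj: Graph, A: Set[int], B: Set[int], C: Set[int], b: int) -> bool:
    """Check that {A, B, C} partitions V, that A and B are non-empty,
    that |A| and |B| are at most b, and that no edge joins A and B."""
    vertices = set(adj)
    if (A | B | C) != vertices or (A & B) or (A & C) or (B & C):
        return False
    if not A or not B:
        return False
    if len(A) > b or len(B) > b:
        return False
    return all(not (adj[u] & B) for u in A)


# Toy path graph 0 - 1 - 2 with unit costs: removing vertex 1 separates 0 from 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
unit = {v: 1 for v in path}
print(is_feasible(path, {0}, {2}, {1}, b=2), solution_cost({1}, unit))
```

The edge check only scans the neighbors of vertices in A, which is sufficient because the graph is undirected.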

HEURISTIC SEARCH FRAMEWORK FOR SOLVING MVS
As described in 'Introduction', heuristic search, especially local search, is a popular paradigm and has recently shown effectiveness on a variety of NP-hard combinatorial problems. The basic idea of local search is that it first constructs a solution as the initial solution, and then iteratively applies heuristics, which modify the resulting solution, to improve the solution quality (which is the cost of the solution, as defined in 'Preliminaries'). Because combinatorial problems are rather different from each other in nature, it is difficult to solve a specific problem by directly applying heuristics designed for other problems. Therefore, it is a challenge to design an effective heuristic search algorithm for solving a combinatorial problem.
BLS introduced the first local search framework for MVS (Benlic & Hao, 2013). This framework is composed of several heuristics and thus is relatively complex. In this section, we introduce a simple heuristic search framework for MVS, in order to demonstrate the most essential parts in heuristic search algorithms for solving MVS.
The basic heuristic search framework for MVS is outlined in Algorithm 1 and described as follows. In the beginning, heuristic search calls the function Construct_Solution to generate a solution s as the initial solution, and the best solution s* is initialized as s (line 1). After the initialization, heuristic search conducts the search stage iteratively until the terminating criterion is reached (lines 2-5). In each search step, heuristic search modifies solution s by employing the function Modify_Solution (line 3); whenever a better solution with a smaller cost is found, the best solution s* is updated accordingly (line 4). After the search stage, the resulting solution s* is reported as the final solution (line 6).
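The framework described above can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the callables `construct_solution`, `modify_solution` and `cost` are placeholders supplied by the caller, and the terminating criterion (a wall-clock limit plus an optional step cap) is an assumption chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

S = TypeVar("S")  # type of a solution; the framework does not inspect it


def heuristic_search(
    construct_solution: Callable[[], S],
    modify_solution: Callable[[S], S],
    cost: Callable[[S], float],
    time_limit_s: float = 1.0,        # assumed terminating criterion
    max_steps: Optional[int] = None,  # optional extra cap, also an assumption
) -> S:
    s = construct_solution()          # initial solution
    best = s                          # s* is initialized as s
    deadline = time.monotonic() + time_limit_s
    steps = 0
    while time.monotonic() < deadline and (max_steps is None or steps < max_steps):
        s = modify_solution(s)        # one search step
        if cost(s) < cost(best):      # keep the best solution found so far
            best = s
        steps += 1
    return best                       # s* is reported as the final solution
```

The sketch assumes that `modify_solution` returns a new solution object instead of mutating its argument; otherwise `best` would have to be stored as a copy.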

THE HSMVS ALGORITHM
On the basis of the simple heuristic search framework in the preceding section, we develop a new heuristic search algorithm called HSMVS for solving MVS. In this section, we present the whole HSMVS algorithm in detail. According to the pseudo-code in Algorithm 1, it is clear that the functions Construct_Solution and Modify_Solution are the most crucial parts in this framework. Thus, we specify these two functions in our HSMVS algorithm.
In order to build an effective heuristic search algorithm, our HSMVS algorithm utilizes an efficient heuristic function named Construct_Solution to construct the initial solution. We outline the pseudo-code of the function Construct_Solution in Algorithm 2. We note that the construction procedure consists of an extending stage and a fixing stage, which are described below.

The construction procedure
The extending stage: In the beginning, three vertex sets A, B and C are initialized as ∅ (line 1). Then, for each vertex v ∈ V, the function puts v into one of these three sets according to the following rules.
The fixing stage: According to the rules in the extending stage, there might be a number of edges that connect some vertices in set A and their neighboring vertices in set B, which makes the resulting solution {A, B, C} infeasible. To construct a feasible solution, the function tries to move some vertices from set B to set C. For each vertex v ∈ B, the function checks whether there are neighboring vertices of v in set A; if this is the case, the function moves vertex v into set C in order to resolve the contradiction (lines 13-15). Finally, the function returns s = {A, B, C} as the solution.
Example 1 To help readers better understand our proposed algorithm, we present an example here to demonstrate how our construction procedure constructs an initial solution in a high-level sense. Figure 1 illustrates an example graph, which has eight vertices and nine edges, and we assume that the cost of each vertex is 1, indicating that the costs of all vertices are the same. For the example graph in Fig. 1, given the limitation size b of 4 (i.e., b = 4), once the extending stage is completed, an infeasible solution could be generated. For instance, assuming the constructed infeasible solution is comprised of A = {4, 6, 7}, B = {0, 1, 2, 3, 5} and C = ∅, since some vertices in set B (i.e., vertices 1, 2 and 5) have neighboring vertices in A, during the fixing stage, vertices 1, 2 and 5 would be moved from set B to set C, resulting in a feasible solution of A = {4, 6, 7}, B = {0, 3} and C = {1, 2, 5}.
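The fixing stage can be sketched as follows. Because Fig. 1 is not reproduced in the text, the edge list below is a hypothetical eight-vertex, nine-edge graph chosen only to be consistent with the statements made in the examples of this article; it should not be read as the actual graph of Fig. 1. The function name `fix_solution` is likewise illustrative.

```python
from typing import Dict, Set, Tuple

# Hypothetical graph (8 vertices, 9 edges), NOT necessarily the graph of Fig. 1.
EDGES = [(0, 1), (0, 3), (1, 2), (1, 6), (2, 4), (2, 5), (3, 5), (4, 6), (5, 7)]
adj: Dict[int, Set[int]] = {v: set() for v in range(8)}
for u, w in EDGES:
    adj[u].add(w)
    adj[w].add(u)


def fix_solution(
    adj: Dict[int, Set[int]], A: Set[int], B: Set[int], C: Set[int]
) -> Tuple[Set[int], Set[int], Set[int]]:
    """Fixing stage: move every vertex of B that has a neighbor in A into C."""
    A, B, C = set(A), set(B), set(C)  # work on copies
    for v in sorted(B):               # iterate over a snapshot of B
        if adj[v] & A:                # v conflicts with some vertex in A
            B.remove(v)
            C.add(v)
    return A, B, C


A, B, C = fix_solution(adj, {4, 6, 7}, {0, 1, 2, 3, 5}, set())
print(sorted(A), sorted(B), sorted(C))  # [4, 6, 7] [0, 3] [1, 2, 5]
```

On this assumed graph the sketch reproduces the outcome of Example 1. Each vertex of B is examined once together with its adjacency set, so the stage is linear in the number of vertices and edges touched.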
As stated in the literature (Cai, 2015), it is important to design a low time-complexity function to generate the initial solution for massive graphs. In fact, most real-world massive graphs are sparse (Barabási & Albert, 1999; Eubank et al., 2004; Cai, 2015). Thus, in most cases, the complexity of our function Construct_Solution is lower than O(|V|²), which indicates that our construction procedure is practical for a large number of real-world massive graphs.

The modification heuristic
The modification heuristic also plays a critical role in the HSMVS algorithm. An important issue in designing an effective modification heuristic is to strike a good balance between intensification and diversification (Li & Huang, 2005). Inspired by the success of two-mode heuristic search algorithms in Boolean satisfiability solving (Balint & Fröhlich, 2010; Li & Li, 2012; Cai & Su, 2013; Luo, Su & Cai, 2014), we propose an effective two-mode modification heuristic named Modify_Solution in the context of MVS solving. Essentially, the heuristic Modify_Solution modifies the current solution by moving a vertex v from set C to the target set X ∈ {A, B} and resolving the contradictions by moving to set C those vertices which are neighbors of v and are currently in the opposite set Y (Y = {A, B} \ X). Clearly, the most important issue of the heuristic is to decide the moving vertex v and the target set X.
Before describing the details of Modify_Solution, we introduce the basic operation in the heuristic. The operation move(v, X, Y), where v ∈ C, X ∈ {A, B}, Y = {A, B} \ X, works as follows. It first moves vertex v from set C to set X. Then, for each vertex w ∈ Y, it checks whether w ∈ N(v); if this is the case, it moves w from set Y to set C to keep the solution legal. We also introduce two evaluating properties score_A and score_B, which are important metrics for evaluating the priority of vertices in set C. The formal definitions of score_A and score_B are given as follows (Definitions 1 and 2).
Definition 1 Given a solution s = {A, B, C}, for each vertex v ∈ C, the property score_A(v) is defined as the decrement in cost(s) after executing the operation move(v, A, B).
Definition 2 Given a solution s = {A, B, C}, for each vertex v ∈ C, the property score_B(v) is defined as the decrement in cost(s) after executing the operation move(v, B, A).
Given a vertex v ∈ C, the evaluation properties score_A and score_B represent the benefit of performing the operations move(v, A, B) and move(v, B, A), respectively. Performing an operation with a larger value of score_A or score_B reduces the value of cost to a larger extent. Therefore, it is advisable to select and conduct an operation with a large value of score_A or score_B.
Example 2 For the example graph in Fig. 1, we assume that each vertex has the same cost of 1, and the current solution is s = {A, B, C}, where A = {4, 6, 7}, B = {0, 3}, C = {1, 2, 5}. For vertex 2 (2 ∈ C), if we move vertex 2 from set C to set A, because vertex 2 has no neighboring vertex in set B, the decrement in cost(s) after executing the operation move(2, A, B) is 1, so score_A(2) is 1. If we move vertex 2 from set C to set B, since vertex 4 (4 ∈ A) is a neighboring vertex of 2 and should be moved from set A to set C, the decrement in cost(s) after executing the operation move(2, B, A) is 0 (i.e., score_B(2) is 0). After comparing score_A(2) and score_B(2), we can decide the suitable set to which vertex 2 should be moved.
These properties play important roles in the reconstruction of solutions and reduction of the cost.
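To make the operation move(v, X, Y) and the two score properties concrete, the following Python sketch implements them on sets. The edge list is a hypothetical eight-vertex, nine-edge graph chosen to agree with the values stated in the examples (Fig. 1 itself is not reproduced in the text), and the single helper `score(adj, cost, v, Y)` is an illustrative way to express both properties: score_A(v) corresponds to Y = B and score_B(v) to Y = A.

```python
from typing import Dict, Set

# Hypothetical graph consistent with the examples; NOT necessarily Fig. 1.
EDGES = [(0, 1), (0, 3), (1, 2), (1, 6), (2, 4), (2, 5), (3, 5), (4, 6), (5, 7)]
adj: Dict[int, Set[int]] = {v: set() for v in range(8)}
for u, w in EDGES:
    adj[u].add(w)
    adj[w].add(u)
cost = {v: 1 for v in adj}  # unit costs, as assumed in the examples


def score(adj: Dict[int, Set[int]], cost: Dict[int, int], v: int, Y: Set[int]) -> int:
    """Decrement in cost(s) if v leaves C and its neighbors in the
    opposite set Y enter C."""
    return cost[v] - sum(cost[w] for w in adj[v] & Y)


def move(adj: Dict[int, Set[int]], v: int, X: Set[int], Y: Set[int], C: Set[int]) -> None:
    """In-place move(v, X, Y): v goes from C to X, and the neighbors of v
    that lie in Y go to C so that no edge joins X and Y."""
    conflicts = adj[v] & Y
    C.remove(v)
    X.add(v)
    Y.difference_update(conflicts)
    C.update(conflicts)


A, B, C = {4, 6, 7}, {0, 3}, {1, 2, 5}
print(score(adj, cost, 2, B), score(adj, cost, 2, A))  # score_A(2)=1, score_B(2)=0
move(adj, 2, A, B, C)
print(sorted(A), sorted(B), sorted(C))  # [2, 4, 6, 7] [0, 3] [1, 5]
```

Computing a score this way only inspects the neighborhood of v, so it costs time proportional to the degree of v. An implementation aimed at massive graphs would more likely maintain the scores incrementally, which the text does not describe and this sketch does not attempt.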
We present the pseudo-code of the whole heuristic Modify_Solution in Algorithm 3, and describe it in detail. Our heuristic Modify_Solution switches between two modes, i.e., the random mode and the greedy mode, in order to strike a good balance between intensification and diversification. Which mode Modify_Solution activates depends on a probability wp. With probability wp, Modify_Solution works in the random mode (lines 1-5); otherwise (with probability 1 − wp), Modify_Solution works in the greedy mode (lines 6-19). The procedures of the random mode and the greedy mode are described as follows.
The random mode: In this mode, the heuristic employs the random walk component to strengthen diversification. The random walk component first randomly selects a vertex v from set C, and then randomly picks a target set X from {A, B}. If set X is set A, then the heuristic performs the operation move(v, A, B); otherwise (set X is set B), the heuristic executes the operation move(v, B, A).
The greedy mode: In this mode, the heuristic applies the approximate best selection component to contribute to intensification, inspired by the success of Best from Multiple Selections (BMS) in the context of minimum vertex cover (Cai, 2015). The approximate best selection component first chooses t vertices from set C, and among these t vertices selects the vertex with the greatest score_A, denoted as v_A (lines 7-11). Then, the heuristic also chooses t vertices from set C, and among these t vertices selects the vertex with the greatest score_B, denoted as v_B (lines 12-16). Finally, the heuristic checks whether score_A(v_A) is greater than score_B(v_B); if this is the case, it executes the operation move(v_A, A, B); otherwise (score_A(v_A) is not greater than score_B(v_B)), it executes the operation move(v_B, B, A).
Finally, our heuristic Modify_Solution denotes the resulting sets A, B, C as sets A′, B′ and C′, respectively, and then returns s′ = {A′, B′, C′} as the resulting solution.
Example 3 In the greedy mode, we first calculate the values of score_A and score_B for each vertex in set C. Figure 2 shows the comparison of a solution before and after the movement. From Fig. 2, we can obtain score_A(1) = 0, score_B(1) = 0, score_A(2) = 1, score_B(2) = 0, score_A(5) = 0, and score_B(5) = 0. Since vertex 2 has the greatest score_A, which is also the greatest among all the values of score_A and score_B, our heuristic chooses vertex 2 and the set A, and then performs the operation move(2, A, B).
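The two-mode heuristic described above can be sketched in Python as follows. Several details are assumptions made for this sketch and are not stated in the text: the t candidate vertices are sampled uniformly at random with replacement, an empty set C is returned unchanged, the function returns new sets rather than mutating its arguments, and the edge list is a hypothetical graph consistent with the examples rather than the actual graph of Fig. 1.

```python
import random
from typing import Dict, Set, Tuple

# Hypothetical graph consistent with the examples; NOT necessarily Fig. 1.
EDGES = [(0, 1), (0, 3), (1, 2), (1, 6), (2, 4), (2, 5), (3, 5), (4, 6), (5, 7)]
adj: Dict[int, Set[int]] = {v: set() for v in range(8)}
for u, w in EDGES:
    adj[u].add(w)
    adj[w].add(u)
cost = {v: 1 for v in adj}  # unit costs


def score(v: int, Y: Set[int]) -> int:
    """Decrement in cost(s) when v leaves C and its neighbors in Y enter C."""
    return cost[v] - sum(cost[w] for w in adj[v] & Y)


def modify_solution(
    A: Set[int], B: Set[int], C: Set[int], wp: float = 0.05, t: int = 20, rng=random
) -> Tuple[Set[int], Set[int], Set[int]]:
    A, B, C = set(A), set(B), set(C)   # work on copies; return A', B', C'
    if not C:                          # assumption: nothing to move
        return A, B, C
    candidates = sorted(C)
    if rng.random() < wp:
        # Random mode: random vertex from C and random target set.
        v = rng.choice(candidates)
        X, Y = (A, B) if rng.random() < 0.5 else (B, A)
    else:
        # Greedy mode: best of t sampled vertices for each target set
        # (sampling with replacement is an assumption of this sketch).
        v_a = max((rng.choice(candidates) for _ in range(t)), key=lambda u: score(u, B))
        v_b = max((rng.choice(candidates) for _ in range(t)), key=lambda u: score(u, A))
        if score(v_a, B) > score(v_b, A):
            v, X, Y = v_a, A, B
        else:
            v, X, Y = v_b, B, A
    # move(v, X, Y)
    conflicts = adj[v] & Y
    C.remove(v)
    X.add(v)
    Y.difference_update(conflicts)
    C.update(conflicts)
    return A, B, C
```

Passing `rng=random.Random(seed)` makes a run reproducible. Sampling only t candidates keeps the cost of one greedy step independent of |C|, which is the point of the approximate best selection on massive graphs.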

Remark:
We note that the solution s′ returned by the heuristic Modify_Solution might be infeasible. If this is the case, the algorithm first rolls back the resulting solution s′ = {A′, B′, C′} to s = {A, B, C}, and then randomly moves a vertex from set A to set C (or moves a vertex from set B to set C).
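A minimal sketch of this rollback step is given below. The text does not specify how the side (A or B) is chosen, so the uniform choice among non-empty sides is an assumption, as are the function name and the tuple-of-sets representation. The feasibility test only checks non-emptiness and the size bound b, on the assumption that the move operation already guarantees that no edge joins A and B.

```python
import random
from typing import Set, Tuple

Solution = Tuple[Set[int], Set[int], Set[int]]  # (A, B, C)


def repair_after_move(old: Solution, new: Solution, b: int, rng=random) -> Solution:
    """Return `new` if it respects the size constraints; otherwise roll back
    to `old` and move one random vertex from A or B into C."""
    A2, B2, _ = new
    if A2 and B2 and len(A2) <= b and len(B2) <= b:
        return new                                # feasible: keep s'
    A, B, C = (set(x) for x in old)               # rollback to s = {A, B, C}
    sides = [side for side in (A, B) if side]     # assumption: uniform choice
    if not sides:
        return A, B, C
    side = rng.choice(sides)
    v = rng.choice(sorted(side))
    side.remove(v)                                # moving a vertex into C cannot
    C.add(v)                                      # create an edge between A and B
    return A, B, C
```

Moving a vertex into C increases the cost by the cost of that vertex, so this step acts as a perturbation rather than an improvement.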

EXPERIMENTS
In order to show the effectiveness of our HSMVS algorithm, we compare HSMVS against an effective breakout local search algorithm BLS, its optimized version BLS-RLE, and an improved K-OPT local search algorithm New_K-OPT on a broad range of real-world massive graphs. In this section, we first introduce the benchmarks, the competitors and the experimental setup of our experiments. Then we report the comparative results.

The benchmarks
We evaluate HSMVS on all 139 graphs collected in a public and standard graph benchmark (https://lcs.ios.ac.cn/~caisw/graphs.html), which is originally collected from Network Repository (Rossi & Ahmed, 2015a; Rossi & Ahmed, 2015b) and consists of a broad range of real-world massive simple undirected graphs. Most of these graphs are encoded from real-world applications. In practice, these real-world massive graphs have been utilized in testing practical algorithms for well-known NP-hard combinatorial optimization problems in graph theory, including minimum vertex cover (Luo et al., 2019; Li et al., 2020), minimum dominating set (Chen et al., 2023), maximum clique (Rossi et al., 2014) and graph coloring (Rossi & Ahmed, 2014). The graphs tested in our experiments contain a variety of real-world networks, and can be classified into 12 categories, including biological networks, collaboration networks, Facebook networks, infrastructure networks, interaction networks, recommendation networks, Retweet networks, scientific computing, social networks, technological networks, temporal reachability networks and web graphs. For the graphs evaluated in our experiments, all the vertices are given unit weights, and b = 1.05|V|/2 (recall that b is the limitation size, first introduced in 'Preliminaries'). These benchmarking settings are suggested by the literature (Benlic & Hao, 2013).

The competitors
• The BLS solver (Benlic & Hao, 2013) is the first local search solver for solving the MVS problem, and it achieves effectiveness in solving MVS instances. According to the experiments in the literature (Benlic & Hao, 2013), BLS performs significantly better than a number of high-performance exact solvers (de Souza & Balas, 2005; Biha & Meurs, 2011). As reported in the literature (Hager & Hungerford, 2015), BLS exhibits its effectiveness in solving MVS on random graphs.
• The New_K-OPT solver (Zhang & Shao, 2015) is a high-performance, improved K-OPT local search solver for solving the MVS problem. The experimental results reported in the literature (Zhang & Shao, 2015) show that New_K-OPT exhibits relatively better performance compared to several methods, such as variable neighborhood search, simulated annealing and Relax-and-Cut (de Souza & Cavalcante, 2011), on a number of graphs.
• The BLS-RLE solver (Benlic, Epitropakis & Burke, 2017) is an enhancement of BLS. The BLS-RLE solver introduces a new parameter control mechanism, which is designed on the basis of reinforcement learning theory. As claimed by its authors, this new parameter control mechanism could help the BLS-RLE solver better escape from local optima. According to the experimental results demonstrated in the literature (Benlic, Epitropakis & Burke, 2017), BLS-RLE performs much better than BLS on various types of graphs.
Footnote 1: For each run, the solver reports a final solution. Since each solver performs 10 runs on each graph, there are 10 reported solutions in total, and the best solution is the one with the smallest cost among all 10 solutions. The best solution quality among all 10 runs is the cost of the best solution.

Experimental setup
Our HSMVS algorithm is implemented in the programming language C++. In our experiments, for HSMVS, the parameter p is set to 0.5, as the initialization should be uniformly random; the parameter wp is set to 0.05 and the parameter t is set to 20 according to preliminary experiments. The local search competitor BLS is an open-source solver and can be downloaded online (http://www.info.univ-angers.fr/pub/hao/BLSVSP/Code/BLS_VSP.cpp). The BLS solver is implemented in the programming language C++. For BLS, we adopt the parameter settings which are reported in the literature (Benlic & Hao, 2013). For BLS-RLE, its implementation is publicly available online (http://www.epitropakis.co.uk/BLS-RLE/). BLS-RLE is implemented in the programming language C++, and it is evaluated using the configuration settings that are utilized in the literature (Benlic, Epitropakis & Burke, 2017). The source code of the improved K-OPT local search competitor New_K-OPT was kindly provided by its author. The New_K-OPT solver is implemented in the programming language C++. For New_K-OPT, we adopt the algorithmic settings which are reported in the literature (Zhang & Shao, 2015). In order to make the empirical comparison fair, all four algorithms HSMVS, BLS, BLS-RLE and New_K-OPT are statically compiled by the compiler g++ with the option '-O3'.
All the experiments are carried out on a number of workstations equipped with Intel Xeon E7-8830 2.13 GHz CPU, 24 MB L3 cache and 1.0 TB RAM under the operating system CentOS 7.0.1406. In our experiments, each solver performs 10 runs on each graph. The cutoff time of each run performed by each solver is set to 1,000 s.
For each graph, we report the best solution quality found by each solver among all 10 runs, denoted by 'best' (see footnote 1), the average solution quality over all 10 runs, denoted by 'avg.', and the average run time of reporting the best solution in each run, denoted by 'time'. If a solver fails to report solutions on a graph within the cutoff time among all 10 runs, we mark 'N/A' for 'best', 'avg.' and 'time' for the related solver on the related graph.
Furthermore, for each solver on each graph class, we report the number of graphs where the solver finds the best solution quality among all competing solvers in the related

Experimental results
In this subsection, we first present the experimental results, and then conduct some discussions about the results.
The comparative results of HSMVS and its competitors BLS, BLS-RLE, New_K-OPT on all real-world massive graphs are reported in Tables 1-7, where Table 1 presents the comparative results on the graph classes of biological networks and collaboration networks, Table 2 presents the comparative results on the graph classes of Facebook networks and infrastructure networks, Table 3 presents the comparative results on the graph classes of interaction networks, recommendation networks, Retweet networks and scientific computing, Table 4 presents the comparative results on the graph classes of social networks and technological networks, Table 5 presents the comparative results on the graph class of temporal reachability networks, Table 6 presents the comparative results on the graph class of web graphs, and Table 7 summarizes the comparative results on all massive real-world graphs.
First we focus on the comparison between HSMVS and BLS. According to the results reported in Tables 1-7, among all 12 graph classes, it is apparent that our HSMVS algorithm performs better than BLS on 9 graph classes (i.e., biological networks, collaboration networks, Facebook networks, infrastructure networks, Retweet networks, scientific computing, social networks, technological networks and web graphs). On the overall performance, according to Table 7, among all 139 real-world massive graphs, our HSMVS algorithm finds the best solution quality for 96 of them, while BLS does that for only 55 of them; HSMVS finds the best average solution quality for 89 of them, while this figure for BLS is only 52.
Then we concentrate on the comparison between HSMVS and New_K-OPT. According to the results reported in Tables 1-7, it is clear that HSMVS significantly outperforms New_K-OPT on all 12 graph classes. On the overall performance, as seen from Table 7, among all 139 real-world massive graphs, HSMVS gives the best solution quality for 96 of them, while this figure for New_K-OPT is only 3; HSMVS finds the best average solution quality for 89 of them, while this figure for New_K-OPT is only 2.
Finally, we analyze the comparison between HSMVS and BLS-RLE. According to the results reported in Tables 1-7, our HSMVS algorithm performs better than BLS-RLE on 7 graph classes (i.e., biological networks, collaboration networks, infrastructure networks, Retweet networks, social networks, technological networks, and web graphs). On the overall

Remark:
The experimental results in Tables 1-7 show that, on a large number of real-world massive graphs, HSMVS generally performs much better than the effective breakout local search competitor BLS, the improved K-OPT local search competitor New_K-OPT, and the enhanced version of BLS named BLS-RLE, indicating that HSMVS shows its superiority on solving real-world massive graphs.

DISCUSSIONS
In this section, we conduct empirical evaluations to further discuss the effectiveness of HSMVS. In particular, we first perform ablation studies to demonstrate the effectiveness of the algorithmic components underlying HSMVS (i.e., the approximate best selection component and the random walk component). Then, we analyze the performance of HSMVS with a different limitation size. Finally, we discuss the advantage of HSMVS compared to its competitors.

Effectiveness of algorithmic components underlying HSMVS
According to the description of the HSMVS algorithm, the approximate best selection component in the greedy mode and the random walk component in the random mode are its key parts. To show the effectiveness of these two components, we develop three alternative versions of HSMVS, all modified from HSMVS and described as follows.
• HSMVS_alt3: This version does not use the random walk component, i.e., it works without the random mode (deleting lines 1-5 in Algorithm 3). In other words, HSMVS_alt3 can be considered a specific version of HSMVS with parameter wp = 0.
Then, we conduct extensive empirical evaluations to compare HSMVS with its three alternative versions on all 139 real-world massive graphs. The experimental setup used in this comparison is the same as the one used in 'Experiments'. To make the evaluation fair, all these alternative versions are also implemented in C++ and statically compiled by g++ with the option '-O3'. Furthermore, the parameter settings used in these three alternative versions are the same as in HSMVS.
Table 8 reports the empirical results of comparing HSMVS with all its alternative versions (i.e., HSMVS_alt1, HSMVS_alt2 and HSMVS_alt3) on all 139 real-world massive graphs. As can be seen from Table 8, HSMVS stands out as the best solver overall in this comparison. In particular, HSMVS performs much better than all its alternative versions in terms of both the best solution quality and the average solution quality. Among all 139 real-world massive graphs, HSMVS finds the best solution quality for 85 of them, while this figure is only 64, 42 and 71 for HSMVS_alt1, HSMVS_alt2 and HSMVS_alt3, respectively; HSMVS finds the best average solution quality for 83 of them, while this figure is only 56, 35 and 67 for HSMVS_alt1, HSMVS_alt2 and HSMVS_alt3, respectively.
Remark: The empirical results presented in Table 8 show that HSMVS generally performs better than all its alternative versions on the real-world massive graphs, which confirms the effectiveness of the approximate best selection component and the random walk component.

Experiment results on different limitation size
In this subsection, we conduct empirical evaluations to assess the performance of HSMVS with a different limitation size. In particular, instead of the limitation size b = 1.05|V|/2 adopted in Section 'Experiments', here we set the limitation size to b = 0.6|V|. These evaluations are conducted on 12 selected graphs, obtained by randomly selecting one graph from each graph class. Table 9 reports the comparative results of HSMVS and its competitors on the 12 selected graphs with b = 0.6|V|, and Table 10 summarizes the overall results on those graphs. As can be observed from Tables 9 and 10, HSMVS still performs generally better than its competitors (i.e., BLS, BLS-RLE and New_K-OPT). According to Table 10, HSMVS gives the best solution quality for 9 of the 12 selected graphs, while this figure for BLS, New_K-OPT and BLS-RLE is 0, 0 and 3, respectively. Also, HSMVS finds the best average solution quality for 8 of them, while this figure for BLS, New_K-OPT and BLS-RLE is 0, 0 and 4, respectively. In summary, HSMVS achieves generally better performance than its competitors with a different limitation size (i.e., b = 0.6|V|).
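As a quick check of how the two settings relate, the arithmetic below uses a hypothetical graph with |V| = 1000 (an illustrative value, not one of the benchmark graphs) and reads the default setting as b = 1.05|V|/2. The names b_default and b_relaxed are introduced here only for illustration. Under these assumptions, the setting b = 0.6|V| is the looser of the two, and both respect the bound 1 ≤ b ≤ 2/3|V| from the problem definition.

```latex
\begin{align*}
b_{\text{default}} &= \frac{1.05\,|V|}{2} = \frac{1.05 \times 1000}{2} = 525,\\
b_{\text{relaxed}} &= 0.6\,|V| = 0.6 \times 1000 = 600,\\
\tfrac{2}{3}\,|V| &\approx 666.7
  \quad\Rightarrow\quad 1 \le b_{\text{default}} < b_{\text{relaxed}} \le \tfrac{2}{3}\,|V|.
\end{align*}
```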

Discussion on the advantage of HSMVS
As presented in Tables 1-7, there is no single best algorithm across all classes of graphs. Hence, in this subsection, we discuss the advantage of HSMVS compared to its competitors. In particular, we analyze the experimental results and the features of the graphs to identify the characteristics of graphs on which HSMVS is more effective than its competitors. Figure 3 illustrates the relationship between the practical performance of the competing algorithms (HSMVS and its competitors) and the size of the graphs (i.e., the number of vertices). In Fig. 3, the X-axis depicts ln(|V|), where |V| is the number of vertices, while the Y-axis depicts ln(|avg| + 1), where |avg| denotes the average solution quality obtained by the corresponding algorithm over all 10 runs. It can be observed that HSMVS shows competitive performance on graphs with relatively large numbers of vertices. As discussed in 'The HSMVS Algorithm', HSMVS strikes a good balance between intensification and diversification. In this way, when handling graphs with relatively large numbers of vertices, HSMVS is able to explore a broader solution space in a shorter time than its competitors, resulting in an advantage on larger-scale graphs.
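The axis transformation used in Fig. 3 can be sketched as follows. This is a minimal illustration, not the authors' plotting code; the function name `figure3_point` and the sample inputs are assumptions. The "+ 1" inside the logarithm keeps the Y value defined when the average solution quality is 0.

```python
import math

def figure3_point(num_vertices, avg_quality):
    """Map one (graph, solver) result to the log-log coordinates of Fig. 3.

    num_vertices: |V| of the graph (must be >= 1).
    avg_quality:  average solution quality of the solver over its runs.
    Returns (ln|V|, ln(|avg| + 1)).
    """
    x = math.log(num_vertices)
    y = math.log(abs(avg_quality) + 1)  # +1 keeps the log defined for avg = 0
    return x, y

# Hypothetical inputs, purely for illustration.
print(figure3_point(1_000_000, 2500.4))
```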

Algorithm 2
The Function Construct_Solution
Input: Graph G = (V, E), limitation size b;
Output: A solution s = {A, B, C};
1: Initialize three vertex sets A, B, C to ∅;
2: foreach vertex v ∈ V do
3:   if with probability p then
4:     if |A| < b then put v into set A;
5:     else if |B| < b then put v into set B;
6:     else then put v into set C;
7:   else
8:     if |B| < b then put v into set B;
9:     else if |A| < b then put v into set A;
10:    else then put v into set C;
11:  end if
12: end foreach
13: foreach vertex v ∈ B do
14:   if N(v) ∩ A ≠ ∅ then move v from B to C;
15: end foreach
16: return s = {A, B, C};

• With probability p, if |A| < b, the function puts v into A; if |A| ≥ b and |B| < b, the function puts v into B; if |A| ≥ b and |B| ≥ b, the function puts v into C (lines 3-6).
• Otherwise (with probability 1 − p), if |B| < b, the function puts v into B; if |B| ≥ b and |A| < b, the function puts v into A; if |B| ≥ b and |A| ≥ b, the function puts v into C (lines 7-11).
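The construction procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' C++ implementation: the graph is assumed to be an undirected adjacency map `adj` (vertex to set of neighbours), the default `p = 0.5` is a placeholder, and the helper `is_separator` is a hypothetical checker added here. The final loop moves every vertex of B that has a neighbour in A into C, so that no edge connects A and B.

```python
import random

def construct_solution(adj, b, p=0.5, rng=None):
    """Sketch of Construct_Solution; adj maps each vertex to its neighbour set."""
    rng = rng or random.Random()
    A, B, C = set(), set(), set()
    for v in adj:
        # With probability p prefer A, otherwise prefer B.
        first, second = (A, B) if rng.random() < p else (B, A)
        if len(first) < b:
            first.add(v)
        elif len(second) < b:
            second.add(v)
        else:
            C.add(v)  # both sides are full: v goes to the separator
    # Repair step: a vertex of B adjacent to some vertex of A joins C.
    for v in list(B):
        if adj[v] & A:
            B.remove(v)
            C.add(v)
    return A, B, C

def is_separator(adj, A, B, C, b):
    """Hypothetical checker: A, B, C partition V, |A|,|B| <= b, no A-B edge."""
    if (A | B | C) != set(adj) or len(A) + len(B) + len(C) != len(adj):
        return False
    if len(A) > b or len(B) > b:
        return False
    return all(not (adj[v] & B) for v in A)

# Usage on a small path graph 0-1-2-3-4-5 (illustrative data only).
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
A, B, C = construct_solution(path, b=3, rng=random.Random(1))
print(sorted(A), sorted(B), sorted(C), is_separator(path, A, B, C, 3))
```

The result is always a feasible separator under these assumptions, but it is usually far from minimal; in HSMVS it only serves as the starting point for the subsequent search.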

Figure 2 Example figures to demonstrate movement. The vertices in red, yellow and blue are the vertices in subsets A, B and C, respectively.

Figure 3. Practical performance of competing algorithms (including HSMVS and its competitors) on graphs with different sizes.

Luo and Guo (2024), PeerJ Comput. Sci., DOI 10.7717/peerj-cs.2013

Algorithm 3
The Function Modify_Solution
Input: A source solution s = {A, B, C}, limitation size b;
Output: A modified solution s′ = {A′, B′, C′};
1: if with a probability wp then
...
7: v_A ← a random vertex from set C;
8: for i ← 1 to t − 1 do
9: r_A ← a random vertex from set C;
10: if score_A(r_A) > score_A(v_A) then v_A ← r_A;
11: end for
12: v_B ← a random vertex from set C;
13: for i ← 1 to t − 1 do
14: r_B ← a random vertex from set C;
15: if score_B(r_B) > score_B(v_B) then v_B ← r_B;
16: end for
17: if |A| == b then move(v_B, B, A);
18: else then move(v_A, A, B);
19: if score_A(v_A) > score_B(v_B) then move(v_A, A, B);
20: else then move(v_B, B, A);
21: end if
22: A′ ← A, B′ ← B, C′ ← C;
23: return s′ = {A′, B′, C′};

experiment, denoted by '#best', the number of graphs where the solver finds the best average solution quality among all competing solvers in the related experiment, denoted by '#avg.', and the average time of reporting the best solution in each run, denoted by 'time'. If a solver fails to report solutions on all graphs in a graph class, we mark 'N/A' for 'time' for the related solver on the related graph class. The number of graphs in each graph class is indicated in the column '#graph'. This form of presenting experimental results is inspired by the rules of the well-known SAT competitions (http://www.satcompetition.org/) and MAX-SAT evaluations (http://www.maxsat.udl.cat/).
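The '#best' and '#avg.' tallies can be computed from per-graph results roughly as sketched below. The data layout, the function name `tally` and the sample numbers are assumptions made for illustration. The sketch also assumes that a smaller separator cost is better and that every solver tied for the best value on a graph is credited, which would be consistent with the reported counts summing to more than the number of graphs.

```python
def tally(results):
    """Count, per solver, the graphs where it attains the best values.

    results: {graph_name: {solver_name: (best_quality, avg_quality)}}
    Returns two dicts: ('#best' counts, '#avg.' counts).
    """
    best_count, avg_count = {}, {}
    for per_solver in results.values():
        min_best = min(best for best, _ in per_solver.values())
        min_avg = min(avg for _, avg in per_solver.values())
        for solver, (best, avg) in per_solver.items():
            best_count.setdefault(solver, 0)
            avg_count.setdefault(solver, 0)
            if best == min_best:  # ties credit every tied solver
                best_count[solver] += 1
            if avg == min_avg:
                avg_count[solver] += 1
    return best_count, avg_count

# Hypothetical results for two graphs and two solvers.
sample = {
    "g1": {"X": (10, 10.5), "Y": (10, 11.0)},
    "g2": {"X": (7, 7.0), "Y": (6, 6.2)},
}
print(tally(sample))
```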

In terms of the overall performance of HSMVS against BLS-RLE, according to Table 7, among all 139 real-world massive graphs, HSMVS finds the best solution quality for 96 of them, while BLS-RLE does so for only 83; HSMVS finds the best average solution quality for 89 of them, while BLS-RLE does so for only 78.

• HSMVS_alt1: This version uses the strict best selection component instead of the approximate best selection component. HSMVS_alt1 differs from HSMVS in lines 7-16 in Algorithm 3: in lines 7-11, HSMVS_alt1 greedily selects the vertex with the greatest score_A from set C, denoted as v_A; in lines 12-16, HSMVS_alt1 greedily selects the vertex with the greatest score_B from set C, denoted as v_B.
• HSMVS_alt2: This version uses the random selection component instead of the approximate best selection component. HSMVS_alt2 also differs from HSMVS in lines 7-16 in Algorithm 3: in lines 7-11, HSMVS_alt2 randomly selects a vertex from set C, denoted as v_A; in lines 12-16, HSMVS_alt2 randomly selects a vertex from set C, denoted as v_B. In other words, HSMVS_alt2 can be considered a specific version of HSMVS with parameter t = 1.
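The difference between the selection components can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: `score` stands in for score_A or score_B (whose definitions are not reproduced here), `candidates` stands in for the vertices of set C, and both function names are hypothetical. The approximate best selection samples t candidates with replacement and keeps the highest-scoring one; setting t = 1 reduces it to a purely random choice, while a full scan gives the strict best selection.

```python
import random

def select_approx_best(candidates, score, t, rng):
    """Best of t uniform samples (with replacement) from candidates."""
    best = rng.choice(candidates)
    for _ in range(t - 1):
        r = rng.choice(candidates)
        if score(r) > score(best):
            best = r
    return best

def select_strict_best(candidates, score):
    """Strict best selection: scan every candidate (cost grows with |C|)."""
    return max(candidates, key=score)

# Illustrative use with a made-up score table.
scores = {"a": 3, "b": 7, "c": 5, "d": 1}
rng = random.Random(0)
print(select_approx_best(list(scores), scores.get, t=3, rng=rng))
print(select_strict_best(list(scores), scores.get))
```

The sampled version costs t score evaluations per selection regardless of the size of C, which is the usual motivation for this kind of heuristic on massive graphs; the strict version must examine all of C on every step.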

Table 9 Results on 12 selected graphs with b = 0.6|V|.